Impact of public sentiments on the transmission of COVID-19 across a geographical gradient

COVID-19 is a respiratory disease caused by a recently discovered, novel coronavirus, SARS-CoV-2. The disease has led to over 81 million confirmed cases, with close to two million deaths. In the current social climate, the risk of COVID-19 infection is driven by individual and public perception of risk and sentiments. A number of factors influence public perception, including an individual’s belief system, prior knowledge about a disease, and information about a disease. In this article, we develop a model for COVID-19 using a system of ordinary differential equations following the natural history of the infection. The model uniquely incorporates social behavioral aspects such as quarantine and quarantine violation. The model is further driven by people’s sentiments (positive and negative), which account for the influence of disinformation. People’s sentiments were obtained by parsing and analyzing COVID-19 related tweets from Twitter, a social media platform, across six countries. Our results show that our model incorporating public sentiments is able to capture the trend in the trajectory of the epidemic curve of the reported cases. Furthermore, our results show that positive public sentiments reduce disease burden in the community. Our results also show that quarantine violation and early discharge of the infected population amplify the disease burden on the community. Hence, it is important to account for public sentiment and individual social behavior in epidemic models developed to study diseases like COVID-19.


INTRODUCTION
COVID-19 is caused by a coronavirus called the severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2). Coronaviruses are a large family of viruses that are common in humans and many different species of animals, including camels, cattle, cats, and bats (Centers for Disease Control and Prevention, 2020a;WHO, 2020d). This virus was discovered in Wuhan, China, in 2019, and the outbreak has since been declared a pandemic by the World Health Organization (WHO). As of December 31, 2020, there were over 81 million confirmed cases [...] measures depend on voluntary compliance by the population, and may depend in part on perceptions and interpretations of risk.
The response of individuals in the community to the threat of an infectious disease is dependent on their perception of risk, which can be swayed by public and private information disseminated through diverse media. Many individuals use social media platforms like Twitter, Facebook, and the internet more generally to share social and health information, and many have used these platforms to also spread misinformation and conspiracy theories. Many health-related organizations also use these platforms to send information to mitigate the spread of contagious diseases (like the flu) by educating users on the effectiveness of regular hand-washing, use of face masks, social distancing, and raising awareness about vaccines (Philipose, 2020). For instance, in the past decade, the Centers for Disease Control made use of Twitter in disseminating information on the prevention of flu to help curb the spread of H1N1 influenza in 2009 (Philipose, 2020). Media reporting is important in the perception, management and even creation of crises (Marino et al., 2009;Tchuenche et al., 2011). Information provided to the public through the media changes human behavior and the population adopts the precautionary measures like the use of face masks for influenza (Jenco, 2020), vaccination (Aminiel, Kajunguri & Mpolya, 2015;Buonomo, d'Onofrio & Lacitignola, 2008), and voluntary quarantine (Hethcote, Ma & Shengbing, 2002). Thus, the role of media coverage and social media responses on disease outbreaks is crucial and should be given prominence in the study of disease dynamics.
Numerous mathematical models have been used to gain insight into the effect of media and behavioural change on COVID-19 transmission dynamics. A SEIQR-type compartmental model was developed in (Feng et al., 2020) to assess the impact of media coverage and quarantine on the COVID-19 infections in the UK. The study showed that stringent containment strategies should be adopted in the UK in order to effectively curtail the spread of the disease. Aleta et al. (2020) used a stochastic model to understand the impact of testing, contact tracing and household quarantine on second waves of COVID-19 in the Boston metropolitan area. Their result showed that a response system based on enhanced testing and contact tracing can have a major role in relaxing social-distancing interventions in the absence of herd immunity against COVID-19. Eikenberry et al. (2020) developed a compartmental model to assess the community-wide impact of mask use by the general asymptomatic public. The study showed that broad adoption of even relatively ineffective face masks could reduce community transmission of COVID-19 and decrease peak hospitalizations and deaths. A mathematical model was developed in Iboi et al. (2021) to assess the impact of a public health education program on the coronavirus outbreak in the United States. Their result suggests the need to obey public health measures as loss of willingness would increase the cumulative and daily mortality in the United States.
Our objective in this study is to gain insight into the contribution of human behavior and public sentiment to the disease spread, not to make explicit epidemiological predictions or forecasts about the disease outbreak. Here, we use tweets as a source of public sentiment data and analyze their average polarity (i.e., negativeness and positiveness) across six countries experiencing variations in disease response and spread (Australia, Brazil, Italy, South Africa, United Kingdom, and United States) during the January to May 2020 time period. Tweets are short messages limited to 280 characters that are posted in real time by users of the Twitter social media platform.
The remainder of the work in this article is organized as follows. In Section 2, we formulate our baseline COVID-19 model with human behavior, compute the basic reproduction number of the model, fit COVID-19 data to the model, and estimate parameters of the model. In Section 3, we carry out sensitivity analysis of the basic reproduction number with respect to each parameter, and sentiment analysis is carried out in Section 4. In Section 5, we incorporate sentiment effects in our basic model, and present results in Section 6. Our discussion and conclusions are presented in Section 7.

BASELINE COVID-19 MODEL
To formulate the COVID-19 model with human behavior, in which some individuals violate quarantine rules, we follow the natural history of the infection (Picheta, 2020;Wilson & Kluger, 2020) and segment the population according to disease status as susceptible ($S(t)$), exposed ($E(t)$), asymptomatic ($A(t)$), symptomatic ($I(t)$), quarantined ($Q(t)$), hospitalized ($H(t)$), and removed ($R(t)$). The equations of the mathematical model are given in Eq. (1). A flow diagram depicting the transition from one state to the other as the disease progresses through the population is shown in Fig. 1, and the associated state variables and parameters are described in Table 1.
The population of susceptible individuals ($S(t)$) is decreased by infection at the rate $\frac{\beta [I(t) + \eta_A A(t) + \eta_Q Q(t) + \eta_H H(t)] S(t)}{N(t)}$, where $\beta$ is the infection rate; we assume that $\eta_A, \eta_Q, \eta_H < 1$, meaning that the asymptomatic, quarantined, and hospitalized individuals are not as infectious as the symptomatic individuals. Once infected, the susceptible move into the exposed class ($E(t)$), and a proportion of the exposed population develops clinical symptoms of the disease at the rate $(1 - q)\sigma$ and moves into the infectious class ($I(t)$), while the remaining proportion shows no symptoms and moves into the asymptomatic class ($A(t)$) at the rate $q\sigma$.
The symptomatic individuals are either quarantined at the rate $\omega_Q$ or hospitalized at the rate $\omega_H$. There have been several reports of people flouting mandatory quarantine rules (Choi, 2020;Crane, 2020;Frias, 2020;Neuman, 2020), so we assume that individuals in quarantine violate the quarantine rules/laws at the rate $\nu_Q$. Given the alarming rate at which the disease spreads and people require hospitalization, hospitals may become overwhelmed and could run out of beds, respirators, ventilators, and ICUs (Starleaf Riker & Chasnoff, 2020). Furthermore, some hospitals are reserving beds for critically ill COVID-19 patients and discharging those with less severe illness to nursing homes (Baker & Fink, 2020;Graham, 2020). Thus, we assume that, due to limitations in hospital beds, respirators, ventilators, and ICUs, some hospitalized individuals leave the hospitals at the rate $\nu_H$. We also assume that once an infected individual recovers, they remain immune to the virus. The removed class ($R(t)$) tracks those that have either recovered at the rates $\gamma_I, \gamma_A, \gamma_Q, \gamma_H$ or died due to COVID-19 at the rates $\delta_I, \delta_A, \delta_Q, \delta_H$, from the symptomatic, asymptomatic, quarantined, and hospitalized classes, respectively.
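The compartmental structure described above can be sketched in code. The following is only an illustrative sketch, not the authors' Eq. (1): the text here does not state where quarantine violators and early-discharged patients go, so the code assumes both return to the symptomatic class, and it assumes that the total population is the sum of all seven compartments (the removed class holds both recovered and deceased individuals). The names `Params`, `rhs`, `rk4_step`, and `simulate` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Params:
    beta: float      # infection rate
    eta_a: float     # relative infectiousness of asymptomatic (< 1)
    eta_q: float     # relative infectiousness of quarantined (< 1)
    eta_h: float     # relative infectiousness of hospitalized (< 1)
    q: float         # proportion of exposed who become asymptomatic
    sigma: float     # disease progression rate out of the exposed class
    omega_q: float   # quarantine rate of symptomatic individuals
    omega_h: float   # hospitalization rate of symptomatic individuals
    nu_q: float      # quarantine violation rate (ASSUMED flow: Q -> I)
    nu_h: float      # early discharge rate (ASSUMED flow: H -> I)
    gamma: tuple     # recovery rates for (I, A, Q, H)
    delta: tuple     # death rates for (I, A, Q, H)


def rhs(state, p):
    """Right-hand side of the sketched S-E-A-I-Q-H-R system."""
    S, E, A, I, Q, H, R = state
    N = S + E + A + I + Q + H + R  # ASSUMPTION: N includes the removed class
    force = p.beta * (I + p.eta_a * A + p.eta_q * Q + p.eta_h * H) / N
    g_i, g_a, g_q, g_h = p.gamma
    d_i, d_a, d_q, d_h = p.delta
    dS = -force * S
    dE = force * S - p.sigma * E
    dA = p.q * p.sigma * E - (g_a + d_a) * A
    dI = ((1 - p.q) * p.sigma * E + p.nu_q * Q + p.nu_h * H
          - (p.omega_q + p.omega_h + g_i + d_i) * I)
    dQ = p.omega_q * I - (p.nu_q + g_q + d_q) * Q
    dH = p.omega_h * I - (p.nu_h + g_h + d_h) * H
    dR = (g_a + d_a) * A + (g_i + d_i) * I + (g_q + d_q) * Q + (g_h + d_h) * H
    return (dS, dE, dA, dI, dQ, dH, dR)


def rk4_step(state, p, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(base, slope, h):
        return tuple(b + h * s for b, s in zip(base, slope))
    k1 = rhs(state, p)
    k2 = rhs(shifted(state, k1, dt / 2), p)
    k3 = rhs(shifted(state, k2, dt / 2), p)
    k4 = rhs(shifted(state, k3, dt), p)
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def simulate(state, p, days, dt=0.1):
    """Integrate the system forward and return the final state."""
    for _ in range(round(days / dt)):
        state = rk4_step(state, p, dt)
    return state
```

Under these assumptions every outflow from one compartment is an inflow to another, so the seven compartments always sum to the initial population; this gives a quick sanity check on any parameter set.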
The associated reproduction number (Diekmann, Heesterbeek & Metz, 1990;van den Driessche & Watmough, 2002) of the COVID-19 model (1), denoted by $R_0$, comprises the expression $R_{0I}$, which represents the contribution of the symptomatic infectious individuals to the reproduction number, and the expression $R_{0A}$, which represents the contribution of the asymptomatic individuals. The reproduction number, $R_0$, is the average number of secondary infections produced when a single infected individual is introduced into a completely susceptible population (Diekmann, Heesterbeek & Metz, 1990;van den Driessche & Watmough, 2002). Hence, COVID-19 can be effectively controlled in the population if the reproduction number can be reduced to (and maintained at) a value less than unity (i.e., $R_0 < 1$).

Data fitting and parameter estimation
Some of the parameters of the model (1) were obtained from the literature, while others were obtained by fitting the model to the observed cumulative case data for each of the six countries (Australia, Brazil, Italy, South Africa, United Kingdom, and United States) during January-June, 2020 (see Tables A1 and A2 for initial conditions and estimated parameters). The cumulative case data from the first index case in each country to June 19, 2020 were obtained from the Johns Hopkins University Center for Systems Science and Engineering COVID-19 Dashboard (Dong, Du & Gardner, 2020). During this time period, these countries instituted lockdowns in March 2020 as a means to control and contain the disease: Italy on March 9, Brazil on March 17, the US on March 19, Australia and the UK on March 23, and South Africa on March 26 (WHO, 2020a). Thus, the model was fitted to two different time periods: the first is the time before each country instituted lockdown measures to curtail the virus, and the second is the time after lockdown was in place. We obtained a different set of values for some parameters in each of the time periods; others, for instance the death rate, the disease progression rate, and the proportion asymptomatic, did not change over this time period.
We estimate the remaining five parameters, $\beta$, $\omega_Q$, $\omega_H$, $\nu_Q$, and $\nu_H$, of the model using the MultiStart algorithm with the fmincon function in MATLAB's optimization toolbox (Burton et al., 2021;Che, Kang & Yakubu, 2020;Che et al., 2021;Edholm et al., 2019;Edholm et al., 2022;Loria, 2018). The fitting was implemented by formulating a least-squares optimization problem with the aim of minimizing the difference between the cumulative cases in each of the six countries and our model's output. In the objective function $J$ that is minimized, the vector $Y_C$ contains the cumulative number of infections obtained from the model and the vector $Y_C^{*}$ contains the corresponding values from the data. Our parameter estimation simulations begin on the date cases were first reported in each of the six countries and take daily time steps until the date our data ends, which is June 19, 2020. The values of the initial conditions used for the fitting are given in Table A1. Given a starting point for our objective function $J$, the fmincon algorithm outputs a local minimum on the surface of $J$. To help find the global minimum, MultiStart allows us to exhaustively test different starting values throughout our bounded range. We used different starting points, each of which converged to a unique local minimum on the surface of $J$. For the United Kingdom, the smallest objective function values obtained before and after the lockdown are $J_0 = 0.17$ and $J_1 = 0.02$, respectively. Figure 2A shows the fit to the observed cumulative cases for the United Kingdom before lockdown was put in place. The estimated values of the fitted parameters are tabulated in Table 2. The fit after lockdown for the UK is depicted in Fig. 2B, and the estimated parameter values used, as well as the parameters for the other countries, are given in Table A2.
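The multistart idea described above can be illustrated with a small, self-contained sketch. This is not the authors' MATLAB code: the local solver below is a simple compass search standing in for fmincon, the model output is a hypothetical logistic curve standing in for the cumulative cases of model (1), and the relative least-squares normalization of the objective is an assumption, since the objective's exact form is not reproduced in this text. All function names are illustrative.

```python
import math
import random


def model_cumulative(theta, days):
    """HYPOTHETICAL stand-in for the model's cumulative cases (logistic curve)."""
    rate, midpoint = theta
    return [1000.0 / (1.0 + math.exp(-rate * (t - midpoint))) for t in days]


def objective(theta, days, observed):
    """ASSUMED objective: squared error relative to the squared data norm."""
    predicted = model_cumulative(theta, days)
    num = sum((p - o) ** 2 for p, o in zip(predicted, observed))
    den = sum(o ** 2 for o in observed)
    return num / den


def local_search(theta, bounds, f, step=0.25, tol=1e-6):
    """Compass search inside box bounds (a simple stand-in for fmincon)."""
    theta = list(theta)
    best = f(theta)
    widths = [hi - lo for lo, hi in bounds]
    while step > tol:
        improved = False
        for i, (lo, hi) in enumerate(bounds):
            for sign in (1.0, -1.0):
                trial = list(theta)
                trial[i] = min(hi, max(lo, theta[i] + sign * step * widths[i]))
                value = f(trial)
                if value < best:
                    theta, best, improved = trial, value, True
        if not improved:
            step /= 2.0  # shrink the stencil only when no neighbor improves
    return theta, best


def multistart(bounds, f, n_starts=20, seed=0):
    """Run the local search from random starts and keep the best minimum."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_starts):
        start = [rng.uniform(lo, hi) for lo, hi in bounds]
        results.append(local_search(start, bounds, f))
    return min(results, key=lambda r: r[1])
```

Each start may end in a different local minimum; keeping the smallest objective value across starts is what makes the procedure more robust than a single local solve.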
The numerical value of the reproduction number $R_0$ for the United Kingdom before the country's lockdown was put in place is estimated using the parameter values tabulated in Table 2, giving $R_0 \approx 2.95$. After lockdown, this value declined to approximately 0.68, a reduction of approximately 2.27.

SENSITIVITY ANALYSIS
In order to assess the relationship between our model parameters, we use the Latin hypercube sampling (LHS) technique, which is a scheme for simulating random parameter sets that adequately cover the parameter space (Blower & Dowlatabadi, 1994;McGreal, 2020;Wang et al., 2013). Uncertainty in model parameters can be identified through the Latin hypercube sampling technique, coupled with partial rank correlation coefficients (PRCCs). We assume that each uncertain parameter is uniformly distributed within a specified range, which is within $\pm 30\%$ of the respective baseline parameter value, and perform a Latin hypercube sampling analysis by generating 1,000 random samples from the chosen parameter distributions. PRCCs were then calculated between each of the sampled parameters and the outcome variable (the basic reproduction number, $R_0$). The sign of a PRCC indicates whether a change in the input parameter has a positive or negative effect on the corresponding output variable (Wang, Liu & Heffernan, 2018;Wang, Liu & Liu, 2016).
The most influential parameters of the model are those that have PRCC values that satisfy $|\mathrm{PRCC}| > 0.4$, where a negative sign indicates an inverse relationship. The correlation between the output variable and the input parameters is moderate if $0.2 < |\mathrm{PRCC}| < 0.4$, and is weak otherwise (Cariboni et al., 2007). Figure 3 indicates that the parameters $\beta$, $\eta_A$, $q$, $\gamma_I$, and $\gamma_A$ have the greatest impact on the outcome function (the reproduction number). The parameters $\sigma$, $\eta_Q$, $\gamma_Q$, $\delta_A$, $\delta_Q$, $\delta_I$, $\omega_Q$, and $\nu_Q$ have a moderate impact on the reproduction number (the outcome function). The dominant parameters in increasing the outcome function ($R_0$) are the transmission rate ($\beta$) and the infection modification parameter for the asymptomatic infectious ($\eta_A$). On the other hand, the dominant parameters in decreasing $R_0$ are the proportion of exposed individuals developing infections ($q$), the recovery rate of infectious individuals ($\gamma_I$), the mortality rate of infectious individuals ($\delta_I$), and the isolation rates of hospitalized and quarantined individuals ($\omega_H$ and $\omega_Q$, respectively).
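The LHS and PRCC procedure can be sketched as follows. This is a minimal illustration under stated assumptions: the output function `toy_r0` is a hypothetical surrogate and not the reproduction number of model (1), the baseline values are invented for illustration, and the PRCC is computed from the inverse of the rank correlation matrix, which is one standard way to obtain partial correlations.

```python
import math
import random


def latin_hypercube(ranges, n_samples, rng):
    """One uniform draw per stratum for each parameter, strata shuffled per column."""
    columns = {}
    for name, (lo, hi) in ranges.items():
        strata = list(range(n_samples))
        rng.shuffle(strata)
        columns[name] = [lo + (hi - lo) * (k + rng.random()) / n_samples
                         for k in strata]
    return columns


def ranks(values):
    """Rank-transform a list (ties are not expected for continuous samples)."""
    order = sorted(range(len(values)), key=values.__getitem__)
    out = [0.0] * len(values)
    for r, idx in enumerate(order):
        out[idx] = float(r)
    return out


def correlation_matrix(cols):
    """Pearson correlation matrix of a list of equally long columns."""
    n = len(cols[0])
    unit = []
    for c in cols:
        mean = sum(c) / n
        dev = [v - mean for v in c]
        norm = math.sqrt(sum(d * d for d in dev))
        unit.append([d / norm for d in dev])
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in unit] for ci in unit]


def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting."""
    n = len(matrix)
    aug = [row[:] + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def prcc(columns, output):
    """Partial rank correlation of each input with the output."""
    names = list(columns)
    ranked = [ranks(columns[k]) for k in names] + [ranks(output)]
    prec = invert(correlation_matrix(ranked))
    y = len(names)
    return {k: -prec[i][y] / math.sqrt(prec[i][i] * prec[y][y])
            for i, k in enumerate(names)}


def toy_r0(beta, eta_a, q, gamma_i):
    """HYPOTHETICAL surrogate output, used only to exercise the procedure."""
    return beta * ((1 - q) + q * eta_a) / gamma_i


# Illustrative baseline values (assumptions), each sampled within +/-30%.
baseline = {"beta": 0.5, "eta_a": 0.5, "q": 0.5, "gamma_i": 0.1}
ranges = {k: (0.7 * v, 1.3 * v) for k, v in baseline.items()}
sample = latin_hypercube(ranges, 1000, random.Random(1))
outputs = [toy_r0(**{k: sample[k][i] for k in sample}) for i in range(1000)]
coefficients = prcc(sample, outputs)
```

With this surrogate, the transmission-like parameter receives a positive coefficient and the recovery-like parameter a negative one, which is the kind of sign information the PRCC analysis is used for.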

SENTIMENT ANALYSIS
In order to carry out the sentiment analysis, tweets from Twitter were downloaded from January 22, 2020 to May 29, 2020 for six countries, namely Australia, Brazil, Italy, South Africa, United Kingdom, and United States. [...] sentiments about the pandemic. For instance, in the United States, a common sentiment is that the virus is a hoax directed at the ruling party (Oliver, 2020;Waldrop & Gallman, 2020). Media reports about the outbreak are another factor driving the public sentiment but are being construed as fake news by some people. In Brazil, the president accused the press of spreading panic and paranoia (BBC News Services, 2020), called the virus "a small flu", and urged the people to go to the streets and "face the disease like men" (BBC News Services, 2020; Gray & Shapiro, 2020;Traumann, 2020). Furthermore, the overall sentiment in Brazil is anti-science in nature: the president once promoted the use of the antimalarial drug hydroxychloroquine as a coronavirus treatment, despite a lack of evidence that it was effective against the virus, while rejecting social-distancing measures (Fraser, 2020;Gray & Shapiro, 2020).
In our work, we used the following procedure to generate sentiment scores for each country using COVID-19 tweets. Each tweet contains information including a unique tweet identification number (i.e., tweetID) and text of up to 280 characters, as well as meta-information about the tweet such as user details, geographic origin, user-defined hashtags to categorize the tweet topic, language, and time of creation. To keep the extraction of COVID-19 specific content consistent with similar COVID-19 studies using Twitter data, the tweetIDs were extracted from a public repository (Chen, Lerman & Ferrara, 2020). These tweetIDs correspond to validated tweets that include 76 hashtags related to COVID-19, including #COVID-19, #coronavirus, #Corona, #sars-cov-2, #Covid19, #SocialDistancing, #quarantinelife, #covididiot, etc. Figure 4 shows sample tweets downloaded over this time period. As evidenced, negative sentiment may be specific to the disease, expected behaviors, or the behaviors of other people. In part, this may reflect a form of polarization on disease responses.
The process of extracting tweets corresponding to the tweetIDs from the Twitter server is known as hydration, and was carried out by a verified Twitter developer with valid application programming interface (API) access. The Twarc hydrator package in Python was used to retrieve the tweets with a sleep time of one second between tweets to avoid the 100,000 tweets/day extraction limit set by Twitter. A total of 125.2 million tweets were collected during the January 22 to May 29, 2020 time period and stored on a Google Cloud Platform (GCP) server with 8 GB RAM and 2 TB storage. However, less than 1% (983,481 tweets) of the downloaded tweets were used, as most tweets do not have a country indicator: users very often do not associate their account with a country and therefore remain geographically anonymous. See Table 3 for the country-by-country breakdown of downloaded tweets.
After data collection, all tweets were translated to English using the googletrans package. Regular expressions were used to perform data cleaning operations on the tweet texts, including removal of special symbols and filtering out URLs. Sentiment scores for all tweets were computed using the textblob package (Ma, 2005). The textblob Python package adopts a rule-based approach to sentiment quantification based on the key indicator words present in a text. A tweet is represented as a bag of words. The positive and negative sentiments of a sentence are based on the weighted average of the sentiments annotated for each word in a large corpus. The sentiment/polarity score varies from $-1$ to $+1$; if it is $< 0$, we classify the tweet as a negative sentiment tweet, and if it is $> 0$, as a positive sentiment tweet. The tweet scores are then averaged for a country per day. Finally, for each of the above listed countries, the average positive sentiment per day was reported. Figure 5 shows the positive and negative sentiments for the respective countries.
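The cleaning, scoring, and daily averaging steps can be sketched as below. This is a hedged illustration rather than the authors' pipeline: the regular expressions are assumed examples of URL and symbol removal, the treatment of tweets with a score of exactly zero (ignored here) is an assumption, and the function names are hypothetical. The only third-party call is `TextBlob(text).sentiment.polarity`, imported lazily so that another scorer can be substituted.

```python
import re
from collections import defaultdict

URL_PATTERN = re.compile(r"https?://\S+|www\.\S+")
SYMBOL_PATTERN = re.compile(r"[^A-Za-z0-9\s']")


def clean_tweet(text):
    """Drop URLs and special symbols, then collapse whitespace."""
    text = URL_PATTERN.sub(" ", text)
    text = SYMBOL_PATTERN.sub(" ", text)
    return " ".join(text.split())


def textblob_polarity(text):
    """Polarity in [-1, 1] from the third-party textblob package."""
    from textblob import TextBlob  # imported lazily; requires textblob
    return TextBlob(text).sentiment.polarity


def daily_sentiment(tweets, scorer=textblob_polarity):
    """Average positive and negative scores per (country, day).

    `tweets` is an iterable of (country, day, text) tuples. Scores of
    exactly zero are ignored (ASSUMPTION: the text does not say how
    neutral tweets were handled).
    """
    buckets = defaultdict(lambda: {"positive": [], "negative": []})
    for country, day, text in tweets:
        score = scorer(clean_tweet(text))
        if score > 0:
            buckets[(country, day)]["positive"].append(score)
        elif score < 0:
            buckets[(country, day)]["negative"].append(score)
    return {
        key: {label: (sum(vals) / len(vals) if vals else None)
              for label, vals in groups.items()}
        for key, groups in buckets.items()
    }
```

Passing the scorer as an argument keeps the averaging logic independent of the sentiment library, which also makes it easy to check the aggregation with a small hand-made scorer.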
To quantify the overall sentiment in each of the countries, we fitted straight lines ($y_p$ and $y_n$ for positive and negative sentiments) through these sentiments, and we took the difference of the lines $y_p$ and $y_n$ to determine whether the overall sentiment from a country is positive or negative during the time period the tweets were collected. We see in Fig. 6 that Australia and the United Kingdom are the two most positive countries, with high positive sentiment levels. They are followed by Italy and South Africa, which had moderately positive sentiment. Brazil and the United States of America have more negative sentiments overall, since they have the least positive sentiment. The European countries, although they were the first to experience a massive wave of the infection, remained relatively positive.
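As a rough check of this ordering, the difference $y_p - y_n$ can be averaged over the collection period using the line coefficients reported in the Fig. 5 caption. The sketch below assumes that $t$ is measured in days from the first day of tweet collection and that the period spans 128 days (January 22 to May 29, 2020); both are assumptions, and the helper names are hypothetical.

```python
# (slope, intercept) for y_p and y_n, copied from the Fig. 5 caption.
LINES = {
    "Australia": ((0.0012461, 0.32225), (-0.00016767, 0.21212)),
    "Brazil": ((0.00032631, 0.18091), (0.00022551, 0.11779)),
    "Italy": ((0.00054929, 0.24898), (-0.00030907, 0.16079)),
    "South Africa": ((0.0005727, 0.26629), (-0.00026964, 0.18524)),
    "United Kingdom": ((0.0012266, 0.34568), (-0.0002375, 0.22246)),
    "United States": ((0.00029309, 0.10708), (5.5321e-06, 0.067976)),
}


def mean_gap(positive, negative, days=128):
    """Mean of y_p(t) - y_n(t) over t = 0..days (ASSUMED daily time unit)."""
    (a_p, b_p), (a_n, b_n) = positive, negative
    gaps = [(a_p * t + b_p) - (a_n * t + b_n) for t in range(days + 1)]
    return sum(gaps) / len(gaps)


def rank_countries(lines):
    """Countries ordered from largest to smallest mean sentiment gap."""
    return sorted(lines, key=lambda c: mean_gap(*lines[c]), reverse=True)


ranking = rank_countries(LINES)
```

Under these assumptions the United Kingdom and Australia come out on top, Italy and South Africa in the middle, and Brazil and the United States last, in line with the ordering described for Fig. 6.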

COVID-19 MODEL WITH SENTIMENT EFFECTS
Figure 5 Positive and negative sentiment of the COVID-19 tweets. Two straight lines, $y_p = a_p t + b_p$ and $y_n = a_n t + b_n$, are fitted through the positive and negative tweets. The straight lines for each country are (A) $y_p = 0.0012461\,t + 0.32225$ and $y_n = -0.00016767\,t + 0.21212$ for Australia; (B) $y_p = 0.00032631\,t + 0.18091$ and $y_n = 0.00022551\,t + 0.11779$ for Brazil; (C) $y_p = 0.00054929\,t + 0.24898$ and $y_n = -0.00030907\,t + 0.16079$ for Italy; (D) $y_p = 0.0005727\,t + 0.26629$ and $y_n = -0.00026964\,t + 0.18524$ for South Africa; (E) $y_p = 0.0012266\,t + 0.34568$ and $y_n = -0.0002375\,t + 0.22246$ for United Kingdom; (F) $y_p = 0.00029309\,t + 0.10708$ and $y_n = 5.5321 \times 10^{-6}\,t + 0.067976$ for United States. DOI: 10.7717/peerj.14736/fig-5
In this section, we incorporate the public sentiments (positive and negative) into the COVID-19 model (1) using the fitted straight lines ($y_p$ and $y_n$ for positive and negative sentiments). First, we used the results obtained from the sensitivity analysis in "Sensitivity analysis" to determine the form of the sentiment-driven functions. Each of these six countries instituted lockdown measures as a way to control the spread of the virus. We expect that, as public awareness increases due to increased media coverage of the infection and the lockdown mitigation efforts, public perception and sentiments will be positive, leading to a decrease in disease transmission. We therefore expect the infection rate $\beta$ to be a decreasing function of public sentiment. We also see from the sensitivity analysis that the infection rate $\beta$ would increase the reproduction number $R_0$. Hence, we define a decreasing sentiment function for this parameter. We also define decreasing sentiment-related functions for $\nu_Q$ and $\nu_H$, since these parameters increase $R_0$. In contrast, the parameters $\omega_Q$ and $\omega_H$ are defined as increasing sentiment-related functions. We have chosen these parameters because they can be influenced by people's behavior, perceptions, and sentiments, unlike the recovery rates, $\gamma_A, \gamma_I, \gamma_Q, \gamma_H$, the death rates, $\delta_A, \delta_I, \delta_Q, \delta_H$, the disease progression rate, and the proportion asymptomatic, $q$. We discuss below how these functions are obtained for each of these parameters.
These countries instituted lockdown measures in March 2020 as a means to contain the virus. For instance, Italy, Brazil, the US, Australia, the UK, and South Africa instituted lockdowns on March 9, March 17, March 19, March 23, March 23, and March 26, respectively. Therefore, we define functions that incorporate the values of these parameters before and after lockdown. Starting with the infection rate $\beta$, we define the sentiment-related function $\beta_M$, where $\beta_0$ and $\beta_1$ are the before- and after-lockdown infection rates. The variable $C_I$ is the cumulative number of symptomatic infectious individuals in the community, tracked by an accompanying equation. If $m > 0$, there is increased awareness about the disease in the community, and the infection rate could decrease to $\beta_1$ ($< \beta_0$) as the number of accumulated infected cases increases, as shown in Fig. 7. The public sentiment-related functions for quarantine ($\omega_{QM}$), hospitalization ($\omega_{HM}$), quarantine violation ($\nu_{QM}$), and early hospital discharge ($\nu_{HM}$) are likewise constructed from the corresponding before- and after-lockdown rates. Note that $\nu_{QM}, \nu_{HM}, \omega_{QM}, \omega_{HM} > 0$ for $C_I > 0$. We assume that $\nu_{Q1} < \nu_{Q0}$, $\nu_{H1} < \nu_{H0}$, $\omega_{Q1} > \omega_{Q0}$, and $\omega_{H1} > \omega_{H0}$.
Furthermore, for an arbitrarily small cumulative number of symptomatic infectious individuals $C_I$, the sentiment-related transition function $\nu_{QM}$ converges to $\nu_{Q0} > 0$, the maximum quarantine violation rate out of the quarantine class before the community lockdown. Also, as the cumulative number of infectious individuals $C_I$ grows, the quarantine violation function $\nu_{QM}$ converges to $\nu_{Q1}$, that is, the minimum quarantine violation rate out of the quarantine class as the effects of public perceptions and sentiments about the infection manifest in the community.
Similarly, the sentiment-related early hospital discharge rate, $\nu_{HM}$, from the hospitalized class converges to $\nu_{H0} > 0$, the maximum early discharge rate, for a small cumulative number of infectious individuals $C_I$ before the onset of public perceptions and sentiments about the disease. Consequently, for an arbitrarily small cumulative number of infectious individuals $C_I$, the public sentiment-related quarantine and hospitalization functions $\omega_{QM}$ and $\omega_{HM}$ converge to $\omega_{Q0} > 0$ and $\omega_{H0} > 0$, the minimum quarantine and hospitalization rates before the onset of public awareness. Also, as the cumulative number of infectious individuals $C_I$ gets larger, $\omega_{QM}$ converges to $\omega_{Q1}$ and $\omega_{HM}$ converges to $\omega_{H1}$. That is, $\lim_{C_I \to \infty} \omega_{QM} = \omega_{Q1} > 0$ and $\lim_{C_I \to \infty} \omega_{HM} = \omega_{H1} > 0$, the maximum rates at which individuals are self-isolated or hospitalized, respectively, as a result of media coverage. See Figs. 9A and 9B for the dynamic behavior of the functions $\omega_{QM}$ and $\omega_{HM}$.
The sentiment parameter $m$ is expressed as $m = \frac{1}{\varepsilon}(y_p + y_n)$, where $y_p$ is the positive sentiment, $y_n$ is the negative sentiment, and $\varepsilon$ is a scaling factor that scales the sentiments per 100,000 of the population density. As described above, we fitted two straight lines through the positive and negative sentiments for each of the countries (see Fig. 5) to obtain the sentiment variables $y_p$ and $y_n$, given as $y_p = a_p t + b_p$ and $y_n = a_n t + b_n$, where $a_p$ and $a_n$ are the slopes of the straight lines and $b_p$ and $b_n$ are the intercepts.
Now, incorporating the sentiment-related functions (4), (5), and the Twitter sentiments (6) into the COVID-19 model (1), we obtain the system of differential equations referred to as model (7). The reproduction number related to model (7) with Twitter sentiment is denoted by $R_{0T}$; it is the average number of secondary infections produced when a single infected individual is introduced into a completely susceptible population.
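The limiting behavior described above can be illustrated with a short sketch. The functional forms of the sentiment-related functions are not reproduced in this text, so the saturating form below is an assumption chosen only because it satisfies the stated limits (the before-lockdown value for small $C_I$ and the after-lockdown value as $C_I$ grows); it should not be read as the authors' Eqs. (4) and (5). The function names and the scaling value passed for $\varepsilon$ are hypothetical.

```python
def sentiment_parameter(t, a_p, b_p, a_n, b_n, eps):
    """m = (y_p + y_n) / eps with y_p = a_p*t + b_p and y_n = a_n*t + b_n."""
    y_p = a_p * t + b_p
    y_n = a_n * t + b_n
    return (y_p + y_n) / eps


def saturating_rate(c_i, before, after, m):
    """ASSUMED form: moves from `before` to `after` as c_i grows (m > 0).

    Equals `before` at c_i = 0 and tends to `after` as c_i -> infinity.
    For positive `before` and `after` the value stays positive, because it
    is a weighted average of the two.
    """
    return before + (after - before) * c_i / (m + c_i)


# Example with the United Kingdom lines from Fig. 5 and an illustrative eps.
m_uk = sentiment_parameter(0.0, 0.0012266, 0.34568, -0.0002375, 0.22246, eps=1.0)
beta_small = saturating_rate(0.0, before=0.5, after=0.2, m=m_uk)   # equals 0.5
beta_large = saturating_rate(1e9, before=0.5, after=0.2, m=m_uk)   # close to 0.2
```

The same helper covers both directions: a decreasing function when the after-lockdown rate is smaller (infection, quarantine violation, early discharge) and an increasing one when it is larger (quarantine, hospitalization).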
Next, we simulated the sentiment-related model (7) using the estimated parameters for each country, plotted the cumulative new cases for each of the countries in Fig. 10A, and compared the results to the trajectory of the actual cumulative reported cases in Fig. 10B. We see that the sentiment-related model (7) captures the trajectory of the actual cumulative reported cases, indicating that incorporating public sentiment into an epidemic model is able to capture the trend in the trajectory of the infection in the population. However, the model-simulated cumulative number of cases saturates much earlier than the actual cumulative number of cases; at this point, we are not sure why. Nevertheless, we are able to realize our goal of understanding the role of public sentiment in disease spread, since we are not using the sentiment-related model (7) to make predictions about the number of cases.

RESULTS
We begin by analyzing the COVID-19 transmission model with quarantine and hospitalization coupled with public sentiment (described in "COVID-19 model with sentiment effects"). Then we analyze the effect of public sentiment and human behavior on the spread and prevalence of COVID-19 in the community.

Impact of public sentiments on disease transmission
In this section, we explore the impact of public sentiments (positive or negative) on disease transmission in the population. Using sentiment-related functions parameterized with [...]. We see in Fig. 11C that negative public sentiments will yield even more symptomatic infectious individuals in the population, but fewer hospitalized individuals (see Fig. 11D). The result for the hospitalized is counterintuitive, as one would expect more hospitalizations with more cases. However, with negative sentiment comes mistrust in establishments. Thus, it makes sense that we see fewer infectious individuals seeking hospital treatment. During the outbreak in 2020, many individuals in the United States relied on chloroquine and hydroxychloroquine, two drugs used to treat malaria, as treatment for COVID-19 and would only go to the hospital when they were critically ill (Joseph et al., 2005;Mahmood, 2020; US Food and Drug Administration, 2020e). Thus, the results shown in Fig. 11 suggest that it is important to incorporate public sentiments into epidemic models. Having a clear understanding of the public perception of the risk of the infection and of public sentiments regarding a disease outbreak and its transmission is vital for control and mitigation efforts.

Impact of human behavior on quarantine and hospitalization
Next, we explore the impact of quarantine and hospitalization on the number of hospitalized individuals in the population while using the sentiment-related functions parameterized with data from the United Kingdom. First, we double the quarantine and hospitalization rates ($\omega_Q$ and $\omega_H$). We notice in Fig. 12A that the epidemic curve for the hospitalized individuals increases and the peak of the curve shifts from left to right (as does the time at which the infection peaks); similarly, the curve of the symptomatic infectious individuals shrinks ("flattens the curve"), while the quarantined population increases and its curve shifts to the right since the rates have been increased. However, when we double the quarantine violation and early hospital discharge rates ($\nu_Q$ and $\nu_H$), we see in Fig. 12B that the curve of the hospitalized individuals shifts from right to left and the number of hospitalized individuals increases. We see similar shifts in the dynamics of the symptomatic infectious and quarantined individuals; however, there are fewer symptomatic infectious and quarantined individuals in the respective classes. It is thus vital to ensure public compliance with quarantine rules and to promote positive sentiment among the populace, as this will go a long way in flattening the epidemic curve.

DISCUSSIONS AND CONCLUSIONS
Discussions
In this study, we developed a novel compartmental mathematical model to study the ongoing COVID-19 pandemic. The model uniquely incorporated human behavior and early discharge from hospital. The model is further coupled with public sentiments about the disease, thereby capturing the effect of disinformation. In particular, the model includes violation of quarantine rules and people's positive and negative sentiments regarding the disease. The model also includes discharge of the infected due to overwhelmed hospital facilities. For instance, at the onset of the pandemic in England, seniors in hospitals were moved back to care homes (Pawelek, Oeldolf-Hirsch & Rong, 2014;Servick, 2020).
Similarly, at the height of the outbreak in Michigan and New York, hospitals were discharging the non-critically ill either to nursing homes or simply letting them go home because the hospital facilities were overwhelmed (Boucher, 2020;NBC 25 News, 2020;Schnirring, 2020), prompting legislation in Michigan to protect the seniors and vulnerable members of the community and prevent nursing homes from admitting patients with COVID-19 (Newport, 2020). In other places like Arizona, some nursing homes took in COVID-19 patients with mild symptoms (Crenshaw, 2020).
Public awareness and information are among the factors driving public perception of risk and sentiment about the disease (Harvard Mental Health Letter, 2020;Ong et al., 2020). At the onset of the pandemic, many people believed (unfortunately, many still believe) that the virus was a hoax, along with a wide range of other conspiracy theories about the disease (Andersen, 2020;Cahn, 2020;Galbraith, 2020;Imhoff & Lamberty, 2020;Miller, 2020;Specia, 2020). Many of these misconceptions and much of the disinformation about the disease were spread on social media platforms like Twitter, Facebook, etc., which in turn drive public views, opinions and sentiments about the disease (Alamoodi et al., 2021;Jarynowski, Wojta-Kempa & Belik, 2020). For instance, Jarynowski, Wojta-Kempa & Belik (2020) used Twitter to capture the structural division of the Polish political sphere, identifying the mainstream opposition and protestant groups, and the possible origin of disinformation in the country. In Brazil, the prevalence of misinformation surrounding the pandemic is deeply concerning, and many people blame the messaging from President Bolsonaro (Gray & Shapiro, 2020).
To measure public sentiments in six countries across different geographical regions, we downloaded tweets from the Twitter platform from January to May 2020. We then carried out sentiment analysis that enabled us to separate the public sentiment into either positive or negative sentiment. While our data set is a multilingual data set across multiple countries, the filter keywords (i.e., hashtags) are mostly in English. Even though 76 hashtags were used to extract tweets, it is possible that tweets that pertain to COVID-19 but do not contain any of these hashtags were excluded. Even though we carried out basic data cleaning and processing tasks, we may have overlooked the small proportion of tweets that contain regional phrases expressing irony, not all of which may have been detected by the sentiment analysis software, which analyzes the English-translated text. However, a visual inspection of the texts and the sentiment scores for each tweet across the different countries showed that the sentiment scores were representative of the tweet sentiment in most cases, thus ruling out systematic biases in inferences made in this study. The collection, aggregation, and analysis of more than 100 million COVID-19 related tweets generated during the time period ensures representation of the general public sentiment across all sub-populations of each country and not just one region or demographic segment.
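As an illustration of how per-tweet sentiment scores can be separated into positive and negative sentiment and aggregated by day, consider the minimal sketch below. It assumes that each tweet has already been assigned a signed numerical score by some sentiment analysis tool; the scoring convention (score above zero is positive, below zero is negative, exactly zero is ignored), the function name, and the sample data are hypothetical and are not taken from the software used in this study.

```python
from collections import defaultdict


def daily_sentiment_shares(scored_tweets):
    """Aggregate (day, score) pairs into {day: (positive_share, negative_share)}.

    Assumption: score > 0 means positive, score < 0 means negative,
    and tweets with score == 0 are treated as neutral and skipped.
    """
    counts = defaultdict(lambda: [0, 0])  # day -> [positive, negative]
    for day, score in scored_tweets:
        if score > 0:
            counts[day][0] += 1
        elif score < 0:
            counts[day][1] += 1
    shares = {}
    for day, (pos, neg) in counts.items():
        total = pos + neg
        shares[day] = (pos / total, neg / total)
    return shares


# Hypothetical scored tweets: (day index, sentiment score).
sample = [(1, 0.5), (1, -0.2), (1, 0.9), (2, -0.4), (2, 0.0)]
shares = daily_sentiment_shares(sample)
print(shares)
```

The resulting daily shares are the kind of time series through which trend lines can subsequently be fitted.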
Misinformation, disinformation, and conspiracy theories can be highly problematic, with a tremendous impact on public health efforts to contain the disease in the community. However, the use of tweets to measure public sentiment may be limiting and may not present the full picture of public sentiment, since public information campaigns might have less impact on society than expected due to filter "bubbles" observed on Twitter (Jarynowski, Wojta-Kempa & Belik, 2020). Hence, it will be beneficial to diversify the sources of public awareness and information in order to reach as many people as possible (Jarynowski, Wojta-Kempa & Belik, 2020) and possibly reduce the spread of disinformation.
After obtaining the positive and negative sentiments, we fitted straight lines through the sentiments in order to determine the magnitude of the sentiments in each of the countries (see Fig. 6). We see that the United States and Brazil had the least positive sentiment. The level of public sentiments in the United States may be due in part to how polarized the country was in the last 4 years, particularly in the months leading up to the 2020 presidential elections. The United Kingdom and Australia had very positive sentiment overall; in the early days of the pandemic in the UK, the entire country, including the royal family, applauded the selfless efforts of the health workers and other frontline workers (BBC News, 2020a), sharing clips on social media under the #ClapForCarers hashtag (Aljazeera, 2020;Saini, 2020).
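Fitting a straight line through a sentiment time series can be done with ordinary least squares. The sketch below shows one way to do this using only the closed-form formulas for the slope and intercept; the function name and the synthetic data are assumptions for illustration, and the fitting procedure actually used to produce Fig. 6 may differ.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic daily share of positive tweets (hypothetical values that lie
# exactly on a line, so the fit should recover the line itself).
days = list(range(10))
positive_share = [0.40 + 0.002 * d for d in days]
slope, intercept = fit_line(days, positive_share)
print(f"slope = {slope:.4f}, intercept = {intercept:.4f}")
```

Here the intercept gives the level of the sentiment at the start of the period and the slope gives its trend, which allows the magnitude of the sentiment to be compared across countries.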
On March 11, 2020, the World Health Organization (WHO) declared the novel coronavirus a global pandemic (WHO, 2020c), and shortly thereafter many countries imposed travel bans from many hotspot regions and instituted lock-down measures in a bid to curtail and contain the spread of the virus. To incorporate public sentiment into our COVID-19 model (1), we segmented the time period into before and after lockdown. We used results obtained from the sensitivity analysis (see Fig. 3) to inform the nature of the different parameters (see Figs. 7-9) that can be influenced by public sentiment. These parameters were then defined as increasing and decreasing functions of public sentiment, which were incorporated into COVID-19 model (1). These parameters consist of before and after lockdown related parameters, which we estimated using data obtained from the Johns Hopkins website (Dong, Du & Gardner, 2020) and some parameter values from the literature (see Table 2).
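The idea of a parameter that depends on public sentiment and switches at the lockdown date can be sketched as follows. The functional forms below (a linear sentiment trend clipped to the interval [0, 1], a transmission rate that decreases linearly with positive sentiment, and a violation rate that increases with negative sentiment) are assumptions made purely for illustration; they are not the functions shown in Figs. 7-9, and all names and numerical values are hypothetical.

```python
def sentiment_level(t, slope, intercept):
    """Linear sentiment trend at time t, clipped to [0, 1] (assumed form)."""
    return min(1.0, max(0.0, slope * t + intercept))


def decreasing_in_sentiment(base_value, sentiment):
    """Parameter that shrinks as the given sentiment grows (assumed form)."""
    return base_value * (1.0 - sentiment)


def increasing_in_sentiment(base_value, sentiment):
    """Parameter that grows with the given sentiment (assumed form)."""
    return base_value * (1.0 + sentiment)


def transmission_rate(t, lockdown_day, before, after):
    """Pick the before/after lockdown parameter set, then apply sentiment."""
    params = before if t < lockdown_day else after
    positive = sentiment_level(t, params["slope"], params["intercept"])
    return decreasing_in_sentiment(params["beta"], positive)


# Hypothetical parameter sets for the two time segments.
before = {"beta": 0.6, "slope": 0.0, "intercept": 0.2}
after = {"beta": 0.3, "slope": 0.01, "intercept": 0.3}

print(transmission_rate(10, lockdown_day=60, before=before, after=after))
print(transmission_rate(70, lockdown_day=60, before=before, after=after))
print(increasing_in_sentiment(0.05, 0.4))  # e.g., a violation rate
```

Segmenting the parameters in this way lets each time period have its own baseline values while the sentiment trend modulates them continuously within the period.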
The reproduction numbers for before and after lockdown (R 0 0 and R 0 1 ) for the respective countries are shown in Table A2. All the countries had a reproduction number above one before lockdowns were put in place, and the values were below one after lockdown, except for South Africa, with an estimated value of R 0 1 = 1.52. This value aligns with the value estimated by the South African National Institute for Communicable Diseases (South African National Institute for Communicable Diseases, 2020). The reproduction number estimated for South Africa by the National Institute for Communicable Diseases at the onset of the outbreak was between 1.7 and 2.5; these numbers reduced substantially but were still above one following measures such as flight restrictions into the country, school closures, and the national level 5 lockdown in mid-March 2020. Some provinces like Western Cape Province had an estimated reproduction number of 1.5-1.7 by mid-late April 2020, while other provinces like Gauteng, KwaZulu Natal, and Eastern Cape Province had an estimated reproduction number of 1.0-1.5 by mid-late April 2020, indicating ongoing transmission or steady disease progression (South African National Institute for Communicable Diseases, 2020).
Our results showed that preventing the spread of disinformation and negative sentiment about the disease in the community is important (see Fig. 11). Thus, it is essential to prevent disinformation, and to promote positive sentiment in the community. It is equally vital to ensure public compliance and adherence with quarantine rules and all mitigation efforts (see Fig. 12). Doing so will go a long way in flattening the epidemic curve, and will lead to the kind of success story observed in New Zealand (Shepherd, 2020;Wikipedia, 2020).
Our study demonstrated that the countries with positive sentiment and quarantine compliance have been more successful at curtailing the spread of the disease. In addition, we have been able to demonstrate the impact on disease burden of early discharge of symptomatic infectious individuals from hospital to make room for incoming severe COVID-19 patients. Overall, our model is able to demonstrate the role of people's behavior and public sentiment in disease transmission. Although the trajectory of the model simulation in Fig. 10A is able to capture the trend of the actual trajectory of the cumulative number of cases in Fig. 10B, our simulation results saturate much earlier. A number of factors may be responsible for this, for instance, non-ascertainment of all infected cases. According to the CDC (Centers for Disease Control and Prevention, 2020b), asymptomatic individuals can account for between 15% and 70% of cases, which in reality are neither tracked nor documented. Note that our model in Fig. 1 incorporates asymptomatic individuals; this may be the reason for the difference between the simulated outcome and the case data.
Since we started this study, the number of cases in these countries has exploded, with some experiencing multiple waves of infections (WHO, 2020b) and some being put into another lock-down (France, Germany, Italy, and the United Kingdom (BBC News, 2020b;Levy et al., 2017;Meloni & Hutchinson, 2020;Savage, 2020)). Although we did not evaluate the sentiment after the lock-down was lifted, we observed a wave of protests against other mitigation efforts like the use of face masks and vaccines in many of these countries, such as the US, UK, Australia, Italy, and Canada (Drury, 2020;McGee, Reynolds & Cullen, 2020;Reuters, 2020;Rinke & Kar-Gupta, 2020). We believe these protests are driven by negative sentiments in society against the use of face masks, which subsequently increase the number of infections, as we observed in Fig. 11.

CONCLUSION
To conclude, this study develops a novel model for COVID-19 that uniquely incorporates human behavior driven by their perception of risk and sentiments about the disease.
The goal of this study was not to make explicit epidemiological predictions about the disease; rather, we hope to provide insight into the effects of human behavior and public sentiments about the disease on non-pharmaceutical intervention strategies (such as self-isolation and quarantine) aimed at containing the disease. The key findings from this study are summarized below.
The simulations of the COVID-19 model (7) with human behavior and public sentiment about the disease show that:
i) An epidemic model that incorporates public sentiment is able to project the trajectory of the disease incidence in the community.
ii) Positive sentiments among individuals in the population reduce the number of infected and the disease burden in the community.
iii) Negative sentiments among individuals in the community amplify the disease burden in the community.
iv) Increasing the quarantine and hospitalization rates decreases the disease burden and reduces the epidemic peak.
v) An increased quarantine violation rate and early discharge of those still infectious due to overwhelmed hospital resources increase the disease burden, leading to an early epidemic peak.
This study has shown that incorporating human behavior and public sentiment into epidemic models is pertinent in order to accurately capture the dynamics and burden of the disease in the community. We have seen the role quarantine violation plays in disease spread; in a future study, we will incorporate other kinds of mitigation efforts, such as vaccination, and public reactions to them. Aside from incorporating mitigation efforts, in our future model we will consider hospital capacity in terms of the number of beds. At the height of the outbreak, a number of hospitals in both urban and rural areas exceeded their capacity to accommodate infected individuals.

A PARAMETER ESTIMATION FOR THE SELECTED COUNTRIES
Initial values for our simulations are given in Table A1; they include the population of each country, N(0), and the exact cumulative value C(0) from the data. The initial values of E(0), A(0), I(0), H(0), and R(0) were chosen to ensure the fit of the trajectory of each country. The initial values are summarized below: